SAS Tools for Educational Data Mining

Authors

  • Jennifer Sabourin
  • Scott W. McQuiggan
  • André De Waal
Abstract

Researchers in the EDM community have always relied on sophisticated tools to analyze data and build models. As the amount of data that can be collected and stored grows, the need for tools capable of handling “big data” becomes ever more pressing. SAS Analytics U is a new initiative for making SAS data analysis and mining tools available for free to educational researchers and instructors. These tools are designed for handling very large data sets and can be run in the cloud, saving researchers valuable time and resources. Furthermore, SAS Analytics U provides a community in which SAS educators and learners can share resources and information about SAS tools and techniques.

This tutorial aims to introduce researchers to the tools available through SAS Analytics U and to show how they can be applied to the field of Educational Data Mining. We will give an overview of the SAS architecture and instruction on the key features of each tool in the suite. We will guide participants through examples using relevant educational data sources to help researchers understand how the tools can be applied to their own work.

REQUIREMENTS: In order to participate in the hands-on exercises, please bring a laptop on which you have installed SAS University Edition. The free download is available at http://www.sas.com/en_us/software/university-edition/downloadsoftware.html. The download and installation may take up to 1 hour, so there will not be time to set up during the tutorial.

1. TUTORIAL DESCRIPTION

This tutorial will focus on introducing SAS to participants and guiding them through the use of the suite of tools using relevant educational data sets. The tools that will be covered include:

SAS Programming Language. The SAS programming language is a powerful language designed specifically for intensive data analysis. This highly flexible and extensible fourth-generation programming language has a clear syntax and hundreds of language elements and functions. It supports programming everything from data extraction, formatting, and cleansing to data analysis, building sophisticated models, and generating reports. The SAS programming language is at the heart of the SAS University Edition tools.

SAS Studio. SAS Studio is the development environment for SAS University Edition and runs through the web browser as well as in the cloud. It offers a powerful graphical interface that allows novice programmers to interact with data and perform analyses without writing any SAS code themselves. The SAS code is nevertheless generated behind the scenes and is visible, which helps users learn the language.

SAS Enterprise Miner. SAS Enterprise Miner helps users streamline the data mining process to create highly accurate predictive and descriptive models based on analysis of vast amounts of data. It includes innovative algorithms in the areas of statistics and machine learning to enhance the stability and accuracy of predictions, which can be verified easily by visual model assessment and validation. Users build process flow diagrams that serve as self-documenting procedures. These diagrams can be updated easily or applied to new problems without starting over from scratch. In addition to process flow diagrams, Enterprise Miner provides a programming interface for advanced users. Enterprise Miner allows integration with open source software for data manipulation and model comparison, with the open standard PMML, and with databases for scoring models without data movement.

Additional SAS tools that may be covered, if they are of interest to the participants, include tools for time series analysis, forecasting, matrix manipulations, and advanced statistics.
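
As a rough illustration of the range the SAS programming language description covers (reading data, cleansing it, and producing a summary report), a minimal sketch might look like the following. It is not taken from the tutorial materials: the data set names (work.quiz_scores, work.quiz_clean), the variables, the toy records, and the passing threshold of 70 are all hypothetical and chosen only for illustration.

```sas
/* Hypothetical toy data: student ID, course section, and quiz score. */
/* A period (.) marks a missing numeric value.                        */
data work.quiz_scores;
    input student_id $ section $ score;
    datalines;
S01 A 82
S02 A .
S03 B 91
S04 B 67
S05 A 74
;
run;

/* Cleansing step: drop rows with a missing score and add a flag      */
/* for scores at or above an assumed passing threshold of 70.         */
data work.quiz_clean;
    set work.quiz_scores;
    if missing(score) then delete;
    passed = (score >= 70);
run;

/* Analysis and reporting step: summary statistics per section.       */
proc means data=work.quiz_clean n mean min max;
    class section;
    var score;
run;
```

A program of this kind can be pasted into a new program tab in SAS Studio and run as-is; the point-and-click tasks in SAS Studio generate code of a broadly similar shape.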


Related resources

Data Mining of Market Knowledge in The Pharmaceutical Industry

The pharmaceutical industry is well known for using SAS to perform quantitative analysis for clinical research. Lately, however, competitive pressures have created opportunities in the marketing departments to use SAS for Data Mining. Applications such as sales force planning and direct marketing to doctors and consumers are employing many new SAS tools and techniques. This paper will provide a...

Educational data mining using jmp

Educational Data Mining is a growing trend in case of higher education. The quality of the Educational Institute may be enhanced through discovering hidden knowledge from the student databases/ data warehouses. Present paper is designed to carry out a comparative study with the TDC (Three Year Degree) Course students of different colleges affiliated to Dibrugarh University. The study is conduct...

Evaluation of Clustering Techniques in Data Mining Tools

Clustering divides a heterogeneous population into a number of more homogeneous subgroups or clusters to reflect the segments in a dataset such that patterns can be recognized. This research shows how a software evaluation framework may be adapted to evaluate commercial data mining tools for a specific user environment. We applied this adaptation to evaluate two major commercial data mining too...

Selecting Classification and Clustering Tools for Academic Support

Classification and clustering are powerful and popular data mining techniques. Organizations use them to capture information, retain customers, and improve business performance. This paper presents a method for selecting data mining software for an academic environment based on its classification and clustering tools. This research applies the data mining software evaluation framework to evalua...

An Evaluation Protocol for Text Mining Tools : ALCESTE, SAS Text Miner, SPAD-CRM and Temis Text Mining Solutions Testing

Within the context of the opening of the electricity market, EDF needs to be able to analyse large volumes of text data to enable the company to have a better knowledge of its customers. With this in mind, several text mining tools intended for analysing this very diverse information in large quantities have been evaluated using three different corpora. It appeared essential to create a table t...

Publication date: 2016